PRESENTING JACOB THOMPSON OF INDEPENDENT SECURITY EVALUATORS.
GIVE THEM A HAND.
OKAY.
THANKS FOR THE INTRODUCTION THERE.
THIS TALK IS CALLED C.R.E.A.M.: CACHE RULES EVIDENTLY AMBIGUOUS, MISUNDERSTOOD.
SO THE ROOT HERE IS THAT WEBSITES USE HTTPS BECAUSE CONTENT IS SENSITIVE, LIKE AN ONLINE BANKING
APPLICATION, CREDIT CARD STATEMENTS, PAYROLL INFORMATION AND SO ON.
THE REASON THEY USE HTTPS IS BECAUSE THAT DATA IS TOO SENSITIVE TO TRANSFER OVER AN
OPEN NETWORK WITHOUT ENCRYPTION.
AND IT ONLY FOLLOWS FROM THERE THAT IF IT'S TOO SENSITIVE TO BE SENT OVER THE NETWORK
WITHOUT PROTECTION, THEN MAYBE IT SHOULDN'T BE WRITTEN TO DISK WITHOUT ENCRYPTION EITHER,
ESPECIALLY WITHOUT THE USER'S KNOWLEDGE.
AND IN THE PAST, MANY WEB BROWSERS WERE CAUTIOUS ABOUT PERSISTENTLY CACHING
INFORMATION JUST BASED ON THE FACT THAT IT CAME OVER AN HTTPS CONNECTION, WHETHER THE
HEADERS SAID TO CACHE IT OR NOT.
AND JUST TO CLARIFY HERE, I'M NOT CONCERNED ABOUT MEMORY CACHING, BUT ONLY PERSISTENT
CACHING TO DISK, AS IN YOU CLOSE THE BROWSER AND IT'S STILL THERE.
SO I ACTUALLY WAS OFFLINE AT ONE POINT AND BORED, SO I OPENED MY COPY OF FIREFOX AND,
NOT HAVING ANYTHING TO READ, I WENT TO THE DISK CACHE.
ALL RIGHT.
SO I WAS VERY SURPRISED TO SEE THINGS THERE FROM MY BANK, LIKE CHECK IMAGES AND ACCOUNT
SUMMARIES, RECENT TRANSACTIONS.
SO MY IMPRESSION HAD BEEN THAT BROWSERS DID NOT CACHE THIS INFORMATION.
SO AFTER THAT, I LOOKED AT 30 SITES, ALONG WITH SOME OF THE OTHER ANALYSTS AT
ISE WHO HELPED ME.
AND WE FOUND THAT 21 OF THEM WERE CAUSING INFORMATION TO BE PERSISTENTLY CACHED IN THE LATEST BROWSERS.
THEY WERE EITHER SENDING NO CACHING-RELATED HEADERS AT ALL OR THEY WERE SENDING HEADERS
THAT WERE NONSTANDARD OR OBSOLETE AND ONLY WORKED IN CERTAIN BROWSERS.
SO THE FIRST THING I'M GOING TO DO IS SHOW YOU A COUPLE OF THE PAGES WHERE WE FOUND
INFORMATION CACHED TO DISK AND WHAT IT WAS.
THEN I'M GOING TO LOOK AT SOME OF THE HISTORY AS TO WHY THIS HAS BEEN SO INCONSISTENT AND
HOW SOMEBODY COULD GET CONFUSED OVER WHETHER THIS HAPPENS AT ALL AND HOW TO PREVENT IT.
AND I'M SORRY, BUT OUR SPEAKER IS A FIRST TIME SPEAKER AT DEF CON.
HE'S NEVER SPOKEN BEFORE.
AND HE CAME UP AND HE SINCERELY ASKED US.
PLEASE.
PLEASE DO NOT INTERRUPT MY TALK.
SO WE'RE NOT GOING TO INTERRUPT HIS TALK.
WE'RE JUST GOING TO SIT UP HERE QUIETLY AND HAVE A DRINK.
AND HERE'S TO DEF CON.
CHEERS.
CHEERS.
THANK YOU ALL FOR COMING.
THANK YOU.
THE FIRST THING I'M GOING TO SHOW YOU IS SOME OF THE INFORMATION WE FOUND
IN THE DISK CACHE.
AND THEN GO OVER SOME OF THE HISTORY ABOUT WHAT BROWSERS USED TO DO, WHAT THEY
DO NOW, WHAT THE STANDARDS SAY,
AND WHAT PEOPLE'S IMPRESSIONS ARE WHEN YOU GO OUT ON THE INTERNET ON WEBSITES LIKE
STACK OVERFLOW AND LOOK FOR THIS.
THEN I'M GOING TO GO OVER A COUPLE OF RECOMMENDATIONS FOR HOW WE THINK IT CAN
BE MORE SECURE.
SO STARTING WITH SOME EVIDENCE HERE.
ADP is a very popular payroll processing company. Does anybody here have their paycheck
done with ADP? Lots of you. So we had someone at ISE who had previous history with ADP. He
logged in to their web interface and looked at a payroll statement. We found that it was
cached and it had nice information there, like last four digits of the social security
number and last four digits of his bank account number, which might be used for authentication
purposes on other sites. So you can see how this could possibly go wrong. ADP was sending
some caching headers, but they were nonstandard and obsolete and they were only interpreted
today by IE. So if you went to this site in Firefox or Chrome, this was left behind. Another
site was Argus, which processes pharmacy claims for health insurance companies. You may not
have heard of Argus, but in Maryland our Blue Cross Blue Shield company uses Argus to handle
their pharmacy claims.
And we logged in to the health insurance and went over to Argus to see the pharmacy
claims and without any caching headers at all were the name of the patient and what medications
they were on and what the dosage was. So that may not be the best thing to have sitting
on your hard drive. And this was sent with, once again, no caching headers, so even IE
would cache this.
Our final one, which might be a little more surprising, is Equifax, which does credit reports.
One of our analysts at ISE went to Equifax and accessed his credit report. It was cached. And this includes information
such as the obvious credit score and name. But also a credit report by definition has
a list of all the accounts that you have reported to the credit reporting agency. And
if you've applied for new credit recently or checked your credit report, they often
ask identity verification questions such as, it looks like you have a mortgage from three years ago, what's the
payment, and stuff like that, the answers to which you could get from the credit report.
So here's a full list of the 21 sites we found that had some form of caching issues.
Some of them big names, banks, others not so big. But it's a pretty big spectrum of
different sites. And here is some of the types of data we found in the cache. Some
of it not so severe, like name, then others more concerning, like date of birth, last
four digits of SSN. A private label department store credit card had full account numbers,
and they don't use expiration dates.
So that's not good. VINs for your auto insurance and so on.
So before I go into what all these nonstandard headers are that these sites may have been
using, if at all, how do you just prevent disk caching in all the browsers that are
popular today? And that's with these two headers. Don't use them as meta tags. There is some
historical precedent for being able to do that, but it's not reliable. Pragma no cache
is the old nonstandard header that goes all the way back to the
mid-nineties, when SSL was first introduced. And you need to pass that header
due to a special case with Internet Explorer 8 and earlier when the server is speaking
HTTP 1.0, as opposed to 1.1. That sounds like a little edge case, but I will get to
that in a little bit.
For all other cases, including IE9 and later, cache control no store is what's in the HTTP
standard, specifically for this purpose. They talk about preventing information from being
revealed on backup tapes, which is the same concept.
So what are some of the headers that we saw that don't work
and fail? So cache control, no cache, that's in the standard, but it is about preventing
a user from seeing information that is stale. It says to the browser, you have to go revalidate
this before you use it out of your cache to serve another request. It has nothing to do
with security. But despite that, when Microsoft first implemented support for it back in IE4,
they decided to interpret it with the same meaning as cache control, no store. They stayed
with that for a while all the way through IE9. Then in IE10 they started following the
standard. So this is something that is still changing up until today. Pragma no cache is
an obsolete header that predates HTTP 1.1 and it has the meaning to IE, if this is over
SSL, don't write it to the disk cache at all. And it still works in IE.
Cache control private we actually saw on a handful of these sites. It's not intended
for web browsers at all. It is about caching proxy servers that are accessed by multiple
users and it says this information is specific to one user. You shouldn't use it to serve
another request by a different user. Cache control in meta http-equiv tags
does not work. The pragma header does work as a meta tag in IE, but there's some buggy behavior around it, and we
have more detail about that in our white paper on the website. So meta cache control does
not work, at least for the purpose of preventing disk caching.
And finally, passing the cache control no store header when the server is using HTTP 1.0:
it's ignored by IE 8 and earlier. And that seems a little weird. Why would a server
send a header that its own protocol version is too old to understand? Actually, until very
recently, the Apache mod_ssl support would automatically downgrade a connection from
version 1.1 to 1.0
if IE was the browser that requested it. This was to work around a bug in persistent
connections in IE 5. And it was still there until two years ago it was patched in the
main branch of Apache. That change has still not percolated down to all the various Linux
distributions including the latest copy of CentOS. So there are many servers out there
that still have that behavior of downgrading to 1.0 including the demo site that I'm about
to show later.
So that's a little weird behavior. You would never realize that unless you actually followed
through, have a site send that header and look in your cache.
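The header rules from this part of the talk can be collected into a small checker. This is only a sketch under stated assumptions: the function names and the dict input format are hypothetical, and the findings simply restate what the talk says (`Cache-Control: no-store` is the standard way to prevent disk caching, `Pragma: no-cache` is still needed for IE 8 and earlier over HTTP 1.0, and `Cache-Control: no-cache` or `private` alone do not help).

```python
def parse_directives(value):
    """Split a header value such as 'private, max-age=0' into lowercase names."""
    names = set()
    for part in value.split(","):
        name = part.strip().split("=", 1)[0].strip().lower()
        if name:
            names.add(name)
    return names


def audit_response(headers):
    """Return a list of findings for one HTTPS response.

    `headers` is a plain dict of header name -> value; that input format
    is an assumption made for this sketch.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    cache_control = parse_directives(lowered.get("cache-control", ""))
    pragma = parse_directives(lowered.get("pragma", ""))

    findings = []
    if "no-store" not in cache_control:
        findings.append("missing Cache-Control: no-store")
    if "no-cache" not in pragma:
        findings.append("missing Pragma: no-cache (IE 8 and earlier over HTTP 1.0)")
    if "no-store" not in cache_control:
        # These directives were seen on real sites but do not prevent disk caching.
        for weak in ("no-cache", "private"):
            if weak in cache_control:
                findings.append("Cache-Control: " + weak + " does not prevent disk caching")
    return findings
```

For example, `audit_response({"Cache-Control": "no-store", "Pragma": "no-cache"})` returns an empty list, while a response that sends only `Cache-Control: private` gets three findings.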
So adding to this confusion as to what works and doesn't work today, things were different
in the past. The first browsers like early versions of Netscape when SSL came out didn't
cache anything that came over HTTPS. And even some later browsers followed that and
even today Safari still works that way.
A server sending something over SSL, it's never cached and the server has no way to
mark something as nonsensitive to make an exception to that.
Firefox did briefly experiment with that in version 3, that being allowing a server to
mark certain things as nonsensitive and I call that an opt‑in policy. They actually
used the header cache control public as a hint to say go ahead and cache this. I know
it's over SSL, but this is just a CSS file or something. It's okay. Then there are
other browsers, like older ones especially, that only allow the pragma no cache header
as a way to mark individual resources as not to be written to the disk cache. And I call
that nonstandard opt out because nonstandard behavior works and standard headers don't.
Then IE as it came into newer versions started to support cache control no store. But the
pragma no cache support was still there. So I call that generous opt out. Be generous in
what you accept as a browser, which is often applied to other parts of HTML rendering and so
on. I have the three versions of IE listed separately here because they have individual
variants on that policy. But the main idea is still there. Old behavior, new behavior, both
work. And finally, in newer browsers such as Chrome and Firefox 4 and later, either the
server sends cache control no store or it gets cached, period, the end.
And because of these discrepancies between
browsers, there's also a lot of confusion out there in the community about what they
really do. If you go to a search engine like Google and search for either of these phrases,
like browsers do not cache SSL or browsers do not cache HTTPS, you will find results,
some of them new, some of them old, telling you that web browsers don't cache things to
disk if they came over SSL.
Some of them from Stack Overflow, blog posts, mailing lists, even a W3C mailing list. And
that may have been true when it was written, especially some of them that say, well, except
for IE, browsers don't cache SSL. But that's not true today, for Chrome and Firefox especially.
In fact, this quote below comes from the OWASP Application Security FAQ. Somebody who should
know better, right? And it says if a web page is delivered using SSL, no content can be
cached. And this may have been true with Firefox 2 and earlier if those were the browsers
you were looking at. But that part of the standard is just not there. There's no specific
behavior that all browsers follow as far as SSL.
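The browser policies described earlier can be modeled in a few lines to make the differences concrete. This is a simplified sketch, not a description of any browser's actual code: the enum and function names are hypothetical, and the per-version IE variants (such as IE treating cache control no cache like no store through IE9, or IE 8 ignoring no store over HTTP 1.0) are deliberately left out.

```python
from enum import Enum


class Policy(Enum):
    NEVER_CACHE_HTTPS = "never cache HTTPS"      # early Netscape, Safari
    OPT_IN = "opt in"                            # Firefox 3
    NONSTANDARD_OPT_OUT = "nonstandard opt out"  # older browsers
    GENEROUS_OPT_OUT = "generous opt out"        # IE, simplified
    STRICT_OPT_OUT = "strict opt out"            # Chrome, Firefox 4 and later


def directives(headers, name):
    """Return the lowercase directive names of one header from a dict."""
    for key, value in headers.items():
        if key.lower() == name.lower():
            return {p.strip().split("=", 1)[0].lower() for p in value.split(",") if p.strip()}
    return set()


def written_to_disk(policy, headers):
    """Would an HTTPS response with these headers reach the disk cache?"""
    cache_control = directives(headers, "Cache-Control")
    pragma = directives(headers, "Pragma")
    if policy is Policy.NEVER_CACHE_HTTPS:
        return False
    if policy is Policy.OPT_IN:
        return "public" in cache_control
    if policy is Policy.NONSTANDARD_OPT_OUT:
        return "no-cache" not in pragma
    if policy is Policy.GENEROUS_OPT_OUT:
        return "no-cache" not in pragma and "no-store" not in cache_control
    if policy is Policy.STRICT_OPT_OUT:
        return "no-store" not in cache_control
    raise ValueError("unknown policy: " + repr(policy))
```

Under this model, a site that sends only `Pragma: no-cache` is protected under the nonstandard and generous opt-out policies but is written to disk under strict opt out, which is the ADP situation from earlier in the talk.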
So let's look at the browser developers who decided, like Mozilla, let's change our caching
policy. After all, that would increase performance. Well, on the bug that was entered into Mozilla
to change from opt in to strict standards-compliant opt out, one of the comments said that
among sites that don't use cache control no store, the correlation between SSL and sensitive
is very low. But for those 21 sites, it doesn't work that way. So where do we go from here?
We have a lot of sites out there on the Internet and a lot of browsers that are interpreting
something very differently. Should browsers assume the website will take care of marking
things as sensitive?
Websites either use headers that they think mark content as sensitive, or they don't realize they need to be doing
it in the first place. So what do we think should be done?
Well, first of all, the obvious thing is to fix web applications. After all, the HTTP
standard does say this is the header to use if you want to prevent disk caching. So in
the long run, cross browser compatibility is about more than the latest HTML5 tag and
the semantics of XMLHttpRequest.
It's also about deeper meanings of the HTTP standard that could have security consequences, such as
disk caching.
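On the application side, the fix amounts to sending both headers on every sensitive response. The following is a minimal sketch as WSGI middleware, assuming the application is served through WSGI; the name `no_disk_cache` is hypothetical, and the wrapper overrides any Cache-Control or Pragma value the wrapped application set.

```python
NO_DISK_CACHE_HEADERS = [
    ("Cache-Control", "no-store"),  # the standard header for this purpose
    ("Pragma", "no-cache"),         # for IE 8 and earlier over HTTP 1.0
]


def no_disk_cache(app):
    """Wrap a WSGI app so every response carries the two headers above."""
    replaced = {name.lower() for name, _ in NO_DISK_CACHE_HEADERS}

    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Drop any existing Cache-Control / Pragma values, then add ours.
            kept = [(n, v) for n, v in headers if n.lower() not in replaced]
            return start_response(status, kept + NO_DISK_CACHE_HEADERS, exc_info)

        return app(environ, patched_start_response)

    return wrapped
```

Applying it to everything is the conservative choice. A real deployment might apply it only to authenticated pages and let static assets such as CSS files be cached.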
I have "fix browsers" as a maybe because it could be reasonably said by a browser vendor
that the standard says you send this if you want to prevent caching. However, at the minimum,
browsers should interpret that pragma no cache header. Despite it being nonstandard,
it did have that meaning back in IE and even in earlier versions of Netscape when they
briefly experimented.
If they are willing to do that, they could go a step further and switch from opt out
to opt in. A browser is not required to cache anything. They could go back to that Firefox
3 policy where nothing is cached unless the server says cache control public.
More importantly, maybe, there is the bad documentation out there that says browsers don't cache this,
or to use pragma no cache, or to use cache control no cache.
That should be fixed. Obviously we can't fix mailing list archives, but at least wikis and
other things like that should be updated and anyone doing security assessments of web applications
should be aware of this issue. Finally, and this is probably the most controversial and
least likely to happen, maybe the HTTP standard should take this into account. In fact, if
you look at RFC 2616 and you search it for SSL, there's a grand total of one occurrence.
And if you search it for HTTPS, there are no occurrences. So while that might make for
a nice layered architecture where the protocol is on top and encryption is underneath, you
shouldn't be ignoring security consequences or assumptions that people are making, especially
when there's this historical behavior among different web browsers. And finally, I've
actually put an HTTPS site out there that tries different combinations of caching headers,
so that you can visit it, then go back to your disk cache and look at it and see what really happens
when you try. So before I bring up that site just as an
experiment, does anyone have a question if you want to go to that microphone there?
If not, I'll go to the demo site. Okay. Yeah, so Safari was on that list as not caching
at all. And the mobile version works the same. Chrome was on the list
there as strict standards-compliant opt out, and the Android browser works the same. So it's
possible on an Android device that these things could be getting cached to disk just like
the desktop version. And it's a little harder to replace your browser on a mobile device
too. There is a slightly older version of the slides on the DVD but all the content
is the same. I can't hear the question. It's something
about PDF? The fact that it's a PDF file, how does that affect the caching?
So it's possible that when it's a PDF file, if they're using the
Adobe plug-in, it's probably, depending on the implementation, caching it to
a temporary file anyway. But we tried it in Firefox, and sending the cache control no store
header does work on a PDF in Firefox, especially with the new built-in reader. Yeah?
Okay. So he's asking if you are a web browser user, can you reconfigure it to go back to
the old policy or just not cache HTTPS. In Internet Explorer, under Advanced,
there's an option for "Do not save encrypted pages to disk." In IE 10 and 11 that is supposed
to work but in earlier versions there were problems with not being able to download files
over HTTPS if that was enabled. In Firefox there's a hidden browser preference that I
have in our white paper on the website that you can set the opposite way to go back to
the no cache SSL policy.
For Chrome we tried writing an extension but didn't get anywhere with the APIs they
provided, and in Safari you have nothing to do because it doesn't cache in the first place.
All right. Yeah?
Would it potentially make sense to actually encrypt the cache on disk?
Possibly, but where do you store the key? Because you want to be able to exit the browser
and go back in and you've got to recover the key. And if you can recover the key, then
so can any other application potentially.
Could you potentially just have it as a one-time key that's lost when the browser exits?
Or just use a memory cache.
But that's valid for like a mobile phone where there's not as much memory.
Okay. We're ready for the demo site. Okay.
All right. So I'm in Firefox. The first thing I'm going to do is clear the cache so that
this is valid.
All right. Zero bytes. And next I'm going to visit our test page. And you can do this
also if you would want to. Once it loads here... it worked earlier. I even have
an /etc/hosts entry. All right.
So what this is doing, it's a main page with a little description of what the issue is,
how to check for it. And then I have some iframes down here linking the pages that
have been configured on the server to send various combinations of these headers. And
I have a small explanation of what they're supposed to do.
So after you visit this page, I'm going to close the browser just so we're guaranteed
that it's in the disk cache. And then go back in. And we'll go to the magic URL. All right.
And it shows all the disk cache entries. And I've named those files on the demo site
so that it tells you what headers it's sending. So no headers.html is not sending any cache
related headers. There's no cache control or pragma. Then cache control no cache. I'm
proving that this does not work. All right. So it might be influencing the decision of
whether to revalidate it before
using it. But it has nothing to do with disk caching. Cache control no store as a meta
tag: it's not there in the headers. It's in here if you look at the hex dump. That
doesn't work either. And it probably shouldn't work because for meta tags to affect caching,
you have some weird condition where you can cache it and then parse it or parse it and
then cache it. And then there would be some weird buggy behavior. And, in fact, if you look
at IE,
which does support the meta tag for pragma headers, there is documentation of bugs telling you,
if it's over a certain size, to put the meta tag at the bottom of the page or something
like that. Crazy. Okay. And pragma no cache is not working here in Firefox. That would
have worked in IE. And one more is cache control public, cache control private. And cache control
private doesn't work either. So you can go to this demo site in various browsers and
check the behavior and prove it to yourself.
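A test site of the kind demonstrated here can be approximated locally. The sketch below is not the speaker's demo site: the page names and the `serve` helper are hypothetical, the header `no-store` page is an added assumption, and it serves plain HTTP, so it only reproduces the header combinations. Observing HTTPS-specific browser behavior would additionally require wrapping the server socket in TLS.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

META_NO_STORE = b'<meta http-equiv="Cache-Control" content="no-store">'

# Hypothetical page name -> (extra response headers, extra HTML for <head>).
TEST_PAGES = {
    "/no-headers.html": ([], b""),
    "/cache-control-no-cache.html": ([("Cache-Control", "no-cache")], b""),
    "/cache-control-no-store.html": ([("Cache-Control", "no-store")], b""),
    "/meta-no-store.html": ([], META_NO_STORE),
    "/pragma-no-cache.html": ([("Pragma", "no-cache")], b""),
    "/cache-control-public.html": ([("Cache-Control", "public")], b""),
    "/cache-control-private.html": ([("Cache-Control", "private")], b""),
}


def build_page(path):
    """Return (extra headers, HTML body) for one configured test page."""
    headers, head_html = TEST_PAGES[path]
    body = (b"<html><head>" + head_html + b"</head><body>"
            + path.encode("ascii") + b"</body></html>")
    return headers, body


class TestPageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path not in TEST_PAGES:
            self.send_error(404)
            return
        headers, body = build_page(self.path)
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        for name, value in headers:
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve the test pages on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), TestPageHandler).serve_forever()
```

Calling `serve()` starts the server. After visiting each page and restarting the browser, the disk cache listing shows which of the named pages were persisted.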
In closing, any more questions?
Somebody? Yeah.
Exactly.
We haven't tested that.
The best thing would be the justification in Firefox as to why they changed it between
version 3 and 4.
Maybe in some of that discussion somebody has done some statistical testing.
Exactly.
That's why they changed it in the first place.
So as long as the site does the proper behavior, we'd be okay.
But that's not happening right now.
This is Firefox 22.
And it hasn't changed since then.
Anything further?
Okay.
Yes.
Thank you.
Thank you very much.
